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The rapid development of 3D acquisition technology has made it possible to obtain 
point clouds of real-world terrains. However, due to limitations in sensor acquisition 
technology or specific requirements, point clouds often contain defects such as holes 
with missing data. Inpainting algorithms are widely used to patch these holes. How- 
ever, existing traditional inpainting algorithms rely on precise hole boundaries, which 
limits their ability to handle cases where the boundaries are not well-defined. On the 
other hand, learning-based completion methods often prioritize reconstructing the entire 
point cloud instead of solely focusing on hole filling. Based on the fact that real-world 
terrain exhibits both global smoothness and rich local detail, we propose a novel rep- 
resentation for terrain point clouds. This representation can help to repair the holes 
without clear boundaries. Specifically, it decomposes terrains into low-frequency and 
high-frequency components, which are represented by B-spline surfaces and relative 
height maps respectively. In this way, the terrain point cloud inpainting problem is 
transformed into a B-spline surface fitting and 2D image inpainting problem. By solv- 
ing the two problems, the highly complex and irregular holes on the terrain point clouds 
can be well-filled, which not only satisfies the global terrain undulation but also exhibits 
rich geometric details. The experimental results also demonstrate the effectiveness of 
our method. 

© 2024 Elsevier B.V. All rights reserved. 


1. Introduction 


existing traditional point cloud inpainting algorithms are not ca- 
pable of handling these holes. Specifically, they often require 


With the rapid advancements of modern sensor technology, 
it has become feasible to acquire point clouds of real-world ter- 
rains at an acceptable cost. However, holes may appear on the 
terrain due to the removal of undesirable objects or occlusion. 
As shown in Figure [I] these holes may have irregular shapes, 
even without well-defined boundaries. 

Currently, both traditional hole-filling algorithms and deep 
learning-based completion algorithms are dedicated to gener- 
ating new point clouds from local point clouds. Unfortunately, 
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a precise description of hole boundaries |I] P], and can only 
effectively operate on simple hole boundaries. However, the 
boundaries become so intricate that accurate determination in 
math becomes difficult for terrain point clouds, even unfeasible 
to define the boundaries. As a result, these algorithms cannot 
be effectively applied in this situation. 


Moreover, while deep learning-based algorithms for point 
cloud completion demonstrate efficient performance on public 
release datasets such as ShapeNet-Part B], PCN and Com- 
pletion3D [5], they have challenges in handling terrain point 
clouds. Firstly, these methods are limited by the size of the 
neural network, making it difficult to generate a large number 
of points that match the scale of terrain point clouds. Sec- 


2 Preprint Submitted for review /Computers & Graphics (2024) 


(a) Raw data 


(b) Holes with irregular shapes 


Fig. 1. In outdoor scenes, the combination of trees and low-rise buildings 
often presents significant challenges for terrain reconstruction using a 3D 
acquisition technique called oblique photography. Upon filtering out these 
extraneous objects, complex-shaped or no-well-defined holes remain. (a) 
An overhead perspective of the terrain point cloud. (b) After applying 
filters, both vegetation and low-rise buildings are effectively eliminated, 
thereby exposing holes within the terrain. Irregular boundary holes are 
marked with blue boxes, while holes lacking well-defined boundaries are 
marked with green boxes. 


ondly, these methods generate interpolated points through en- 
coding and decoding the partial point cloud as a latent repre- 
sentation [4]. Consequently, there is a tendency to reconstruct 
the entire point cloud instead of solely filling in the missing 
parts, resulting in a general reconstruction that lacks specific 
details [6]. 

In this paper, we propose a novel method to repair complex- 
shaped holes on terrain point clouds. It is built on a new repre- 
sentation of terrain point clouds based on the fact that terrains 
can be treated as a combination of low- and high-frequency 
components, which represent the terrain undulation and geo- 
metric details respectively. With respect to the low-frequency 
components, we use B-spline surfaces to fit them. In addition, 
we build a relative height map based on the B-spline surfaces 
to represent the high-frequency components. Based on this new 
representation, the terrain point clouds inpainting problem is 
converted to a B-spline surface fitting problem and an image 
inpainting problem. Obviously, it is easier to fix these two 
problems than to patch complex-shaped point cloud holes di- 
rectly. In addition, this method preserves terrain undulation 
and geometric details well by repairing both the low- and high- 
frequency information. 

In summary, the main contributions of this paper are as 
shown below: 


e We propose a new representation for terrain point clouds. 
It decomposes the terrain point clouds into low- and high- 
frequency components, which represent the terrain undu- 
lation and geometric details respectively. 


e We design an automated hole location method. Specifi- 
cally, we give a unique definition for holes and directly 
locate the position of the holes. In this way, the detection 
of complex hole boundary points is not required. 


e We fit the terrain point cloud to a B-spline surface and 
treat it as the low-frequency component. The use of the 
B-spline surface can help fully represent the local struc- 
tural features of the 3D point cloud in its parameter space 


and avoid geometric loss in the format transformation. 


The remainder of the paper is organized as follows. Section[2] 
describes the related work. An overview of our method is pre- 
sented in Section B] Section 4]describes the methods for con- 
structing and inpainting low-frequency component. Section [5] 
presents our method of inpainting high-frequency components. 
Section |6]introduces the process of reconstructing point cloud 
signals. Section [7] provides a more detailed description when 
handling large-scale point clouds. The experimental results are 
shown in Section [8] The Discussion is provided in Section | 
Lastly, conclusions are given in Section[I0] 


2. Related Work 


Currently, both deep learning-based methods and traditional 
methods have reached a mature stage. Deep learning-based 
point cloud completion methods have made rapid advance- 
ments, encompassing point-based [7], convolution-based [8], 
graph-based [9], transformer-based [10], and generative model- 
based [II]. These techniques have demonstrated impres- 
sive results on existing datasets in the CAD field, such as 
ShapeNet [B], ModelNet [12], and Completion3D [5]. How- 
ever, their robustness is limited when confronted with real- 
world scenarios that lack an abundance of actual data. Further- 
more, the constraint of GPU memory hinders direct processing 
of large-sized point clouds using deep learning methods. Vox- 
elization preprocessing, although utilized, can lead to the loss 
of essential information within the point cloud. Despite the ex- 
istence of feature extraction networks like PointNet [13], there 
is a need to further emphasize feature learning [14]. 

Based on the data format, existing traditional point cloud 
inpainting methods can be broadly classified into two cate- 
gories: mesh-based inpainting methods, and point-based in- 
painting methods. Although mesh format applied in mesh- 
based inpainting methods offers a natural advantage in automat- 
ically detecting hole boundaries due to its inherent topology, the 
geometric loss incurred during the inpainting and format trans- 
formation process is unacceptable. 

Mesh-based inpainting methods. Generally, these methods 
firstly grid point clouds, and then detect hole edges based on 
mesh topology. Lastly, they use the geometric structure and 
topological characteristics of the neighborhood to interpolate 
the hole area [16]. For example, Wei er al. applied 
a minimal triangulation to holes and iteratively subdivided the 
patching meshes in order to assort with the density of surround- 
ing meshes. However, mesh interpolation can become trapped 
in planar hole-filling content [18], resulting in a lack of detail. 
In addition to these grid-based methods for filling holes in gen- 
eral models, there are also unique methods for oblique photo- 
graphic data. For example, Wang et al. divided the point 
cloud space into cells and individually repaired each cell to ac- 
celerate the hole-filling process. However, this method encoun- 
ters precision issues when the reconstructed mesh is rough and 
performs poorly when dealing with large holes. 

Point-based inpainting methods. These point cloud in- 
painting methods work directly on point clouds, avoiding the 
costly process of format transformation. One common solution 
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Fig. 2. Core idea of our method. 


for inpainting is the patch-based matching scheme. For exam- 
ple, Huang et al. detected holes directly in point clouds 
and then iteratively searched for context with local similarity, 
propagating it appropriately along the local geometric structure 
of occlusion. Dinesh et al. filled holes in an iterative way. 
Specifically, they gave boundary points priority determining the 
order in which they were processed, and used templates near the 
hole boundary to find the best matching regions. Additionally, 
Shi et al. considered the normal features of point clouds for 
patch matching and replacing. Some works attempt to simplify 
the problem of detecting holes and inpainting by adjusting the 
representation of unorganized point clouds. For example, Doria 
etal. projected points onto a depth map from a certain per- 
spective and then used an image-based method to complete the 
inpainting and reconstruct 3D point clouds from the inpainted 
image. Hu et al. organized point clouds with a graph 
and proposed a method that exploits both the local smoothness 
and non-local self-similarity, which also relies on the depth map 
to find holes. However, while these methods introduce new rep- 
resentations and work well in many cases, they can lead to geo- 
metric losses in constructing depth maps when facing complex 
scenarios. Altantsetseg et al. created contour curves inside 
the boundary edges of the hole to add locally uniform points to 
the hole, but this method heavily relies on accurate hole bound- 
aries. 


3. Method Overview 


Our method proposes a new point cloud representation for 
terrain, based on the important fact that terrain possesses glob- 
ally smooth with rich local geometric details [26]. Depending 
on this representation, we have the capability to process terrain 
point clouds of arbitrary shapes and the problem of filling holes 
in 3D point clouds can be primarily transformed into a 2D im- 
age inpainting problem. Moreover, there is no longer a need to 
extract hole boundary points as required by previous methods; 
instead, we can directly locate the hole regions. 

As shown in Figure [2] the core idea of our method is to de- 
compose terrain point clouds C into a low-frequency compo- 
nent L and a high-frequency component H, where L describes 
the basic shape of the terrain and H captures the details based 
on L. It is worth mentioning that DEM (Digital Elevation 


Model) in geographic information systems can also be appli- 
cable within our new representation. As illustrated in Figure B] 
it could be treated as the simple form of the above decomposi- 
tion, where the XY plane is fixedly chosen as L and the elevation 
is considered as H. However, XY plane is not a good choice in 
every situation to represent the low-frequency information of 
terrain because it provides no information about the terrain un- 
dulations. 


(a) DEM Method (b) Ours 


Fig. 3. A comparison is made between the DEM method and our method 
from a lateral perspective. In this comparison, the original signal is de- 
picted in green, the low-frequency component L is represented in blue, and 
the high-frequency component H is depicted in yellow 


. (a) illustrates the DEM method, where the XY plane 
corresponds to the ground plane and represents L, while the 
height of the original signal above the ground represents H. 

(b) illustrates our algorithm, where the fitting B-spline surface 
is considered as L, while the distance between the original 
signal and the surface represents H. 


As shown in the Figure |4| the proposed method consists of 
three stages and the detail description of them is shown below: 


e Low-frequency Component Inpainting Stage: In this 
stage, we utilize a B-spline surface as the low-frequency 
component to fit the 3D point cloud. Once the fitting pro- 
cess is complete, we consider the inpainting of the low- 
frequency component to be finished. 


e High-frequency Component Inpainting Stage: We gen- 
erate the high-frequency component by projecting each 
point to the B-spline surface, which involves finding the 
nearest point on the surface and calculating the projection 
distance. We record the projection information above to 
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Fig. 4. The pipeline of our method. 


construct a high-frequency image (i.e., height map) after 
rasterizing. Then, we fill holes in the height map by solv- 
ing the Poisson equation, guided by patch matching in the 
gradient domain. 


e Reconstruction Stage: We compute offset distances away 
from the low-frequency B-spline surface by sampling in 
the high-frequency image, and then resolve 3D points by 
superimposing these two components to complete the re- 
construction process. 


4. Low-frequency Component Inpainting 


B-spline surface fitting is a well-established technique with a 
solid theoretical foundation and logical framework. It has been 
widely used in the field of reverse engineering [28]. Gen- 
erally, B-spline surface S(u, v) can be defined as a network of 
tensor product B-spline surface as shown in Equation (I): 


S(u, v) = = >» Nia WN ja, (V)Bi j (1) 


i=0 j=0 


where B denotes a set of m + 1 rows and n + 1 columns con- 
trol points, and B; j € R? represents the control point at i-row 
and j-column. Njq,(u) and N;a,(v) are B-spline basis func- 
tions of degree d, and d,, defined over the knot sequences u = 
(Up, 41,°** , Um+a,) in the u-direction and v = (vo, V1,°** ,Vn+a,) 
in the y-direction respectively, with local coordinates (u,v) € 
[0, 1}. 

The goal of B-spline surface fitting is to identify the control 
points B and knot sequences (u, v) that minimize the distance of 
C to S, as shown in Equation 2): 


Ne 
Doss = X dpi, SP (2) 
i=1 


where Nc is number of points in C. d(p;, S) represents the dis- 
tance of discrete point p; in C to B-spline surface S which can 
be calculated by Equation (3): 


d(pi,S) = min || pi- S(ui,vi) IP (3) 


(uj,vi)€[0, 1]? 


where (u;, vi) is the parametrization of the projection of p; onto 


S. 


To simplify the fitting problem, (u, v) is obtained by uniform 
parameterization. Obviously, Equation (2) is a nested optimiza- 
tion problem, and can be naturally solved by an iterative op- 
timization method. Specifically, each iteration is composed of 
two steps: 


e Fitting step. For the given (u;,v;) of projection point of 
pi, the best B are obtained by solving a linear least squares 
problem. 


e Correcting step. For the given B, the best (u;, v;) of pro- 
jection point of p; are obtained by projecting p; onto the 
given S. 


It is worth noting that the B-spline surface fitting method 
mentioned above is based on iterative local optimization. There 
is no guarantee of convergence to a global optimum. The suc- 
cess of obtaining an optimal approximation depends on a good 
choice of the initial B-spline surface shape. However, if the tar- 
get shape is relatively uncomplicated, a simple initial surface 
is generally enough to achieve stable convergence. Coinciden- 
tally, the natural terrain point clouds in our task are not partic- 
ularly complex from an overall perspective. Therefore, we use 
a simple method of initializing B by building an octree [29], 
which balances both time cost and fitting quality. After initial- 
ization, only a few iterations are typically necessary to signifi- 
cantly improve the fitting accuracy. 

After getting B by the above iteration optimization method, 
we can obtain the fitted B-spline surface L, i.e., the inpainted 
low-frequency component. 


5. High-frequency Component Inpainting 


5.1. High-frequency Component Construction 


Once the low-frequency component L has been constructed, 
the calculation of the high-frequency component H follows nat- 
urally. To be more precise, H corresponds to the signed distance 
of C projected onto S. 


5.1.1. Projection Point Calculation 

If a point qi (qi = S(u;,v;)) on the surface minimizes the 
distance d(p;, gi), we consider q; as the projection of p; onto the 
surface. In order to calculate g;, Newton’s method is employed 
to solve this problem. Specifically, the coordinates (u}, v;) that 
minimize d(p;, S(u;, v;)) satisfy the condition d’(p;, S(uj, vi)) = 
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0. To find the root of this equation, we can utilize Newton’s 
method, which involves the iteration formula: 


d' (Dis T uem yom) we 0 1.2.-+: 
ad” (Di, Tuem yom) ‘ Ta 

(4) 
Here, m represents m-th iteration. This method exhibits 
quadratic convergence, indicating that Newton’s method is gen- 
erally faster than quadratic minimization and can converge to a 
solution within a few iterations [BO]. Initially, the method may 
have a slower start, but it progressively approaches the optimal 
value and rapidly converges to the final solution d(p;, q;), where 


q; represents the optimal projection point of pj. 


` 


alate prety = (u smj = 


5.1.2. Projection Point Initialization 

It is important to note that the choice of initial values in 
Newton’s method significantly influences the efficiency of con- 
vergence. To address this, we propose a method for calculat- 
ing the initial projection point CCH = S, v?)) prior to ex- 
ecuting iterations, thereby accelerating the subsequent steps. 
Firstly, we sample S with a uniform 2D sampling sequence 
{(uo, vo), (Uo, v1), - - - , (Uz, Vy)}, Where (ux, vi) = (£, £), a € [0,2], 
b € [0,7]. These sample points are represented as S(ux, vi). 
Secondly, for each data point p;, we calculate the distances 
(pi, Sug, v) =|| pi — S(ug, vı) || from p; to S(ug, vı). Sub- 
sequently, we select the minimum d(p;, S(ux, vi)), denoted as 
d(p;, S(u;, v;)) and consider q? = S(u;,,v;) as the initial projec- 
tion point of p;. 


5.1.3. Sign Determination of Projection Distances 

To ensure accurate generation of the point cloud in the subse- 
quent reconstruction step, it is crucial to differentiate between 
the two potential directions, as depicted in Figure |5| In sim- 
pler terms, we utilize the normal vector ñ}, at q; and the vector 
from q; to p;, denoted as fg, - Rg,+p, to determine this distinc- 
tion. Theoretically, since q; is the closest point to p; on S, the 
line that encompasses #4, must pass through p;, resulting in two 
possibilities for the orientation of ñ}, and R,,),: they can ei- 
ther be parallel or anti-parallel. However, considering that in 
Section 5.1.1] we obtained q; through a finite number of itera- 
tions using Newton’s method, it becomes necessary to account 
for the presence of an error e. In other words, the condition that 
holds true is: 


1- litg, *Rgp)| < € (5) 


where ñ, and Ry,+,, represent unit vectors. Additionally, we 
can assume that q; are highly reliable, indicated by |i#,,-Rg,5p,| > 
0. 

Therefore, the calculation of ñ}, is necessary as S itself does 
not include normal information. To achieve this, we gather all 
qi aS a new point cloud and construct a K-nearest neighbors 
(K-NN) graph for them. Subsequently, we employ Principal 
Component Analysis (PCA) to estimate their respective 
normals ñ4,. Since the orientation of /i,, can be ambiguous, a 
broadcasting method is applied as a post-processing step to en- 
sure consistent orientation across all 7,,. Once this is done, we 


P2 


Fig. 5. Illustration of point projection onto a surface. qı and q2 represent 
the projection points of pı and p2, respectively. pı and pz reside on oppo- 
site sides of S and exhibit different projection directions. 


can proceed to calculate the sign of the projection distance de- 
noted as w: 


3 
_ fg’ Rap; 


(6) 


S litg, Rg pil 
Furthermore, the calculation of the signed distance, specifically 
the high-frequency components H, can be obtained using the 
following procedure: 


H = Dy, = w; X d(pj, qi) ) 


5.2. Height Map Construction 


Considering the inconvenience of directly locating holes 
from projection information, which hinders the high-frequency 
component inpainting, we construct a height map to achieve 
this purpose. By projecting point cloud C onto S and gener- 
ating projection points q; (qi = S(ui, vi)) for each correspond- 
ing point p; in C, we establish a one-to-one correspondence be- 
tween (u;, vi) and p;. This means that the projection information 
in the two-dimensional UV space can reflect the local neighbor- 
hood relationships in the three-dimensional point cloud space. 
While directly locating holes or extracting boundaries in the 
continuous two-dimensional space is not convenient, our pri- 
mary interest lies in determining the hole positions rather than 
identifying the points that belong to the hole boundaries. There- 
fore, we discrete the UV space by rasterizing it into pixels at a 
self-adaptive resolution (r,r), and examine the presence of q; 
within these pixels to determine their classifications and values. 
In simpler terms, pixels without any q; are considered holes, 
while pixels containing q; undergo further analysis to determine 
their values. 


5.2.1. Self-adaptive Resolution Estimation 

For convenience, we denote the corresponding coordinates 
of qi in the two-dimensional UV space as t;, where t; = (uj, vi). 
Based on empirical observations, we find that the local value 
of r should be closely associated with the local density of t;, 
denoted as p. Assuming that the global density of t; remains 
relatively constant, we propose a simple method to estimate it. 
Firstly, we construct a KD-tree to efficiently calculate the k- 
nearest neighbors of t; and sort them based on their distances 
from t;. Next, we select the median distance, denoted as m,, 
from this set of candidate distances as the density value of t;. 
Using this approach, we can determine the value of p: 
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1“ 
= il 0, 8 
p pael n] (8) 


where n; refers to the number of t;, which is also equal to ne. 
To calculate the value of r, we use the following formula: 


r= 


1 
= (9) 
p 


5.2.2. Pixel Values Calculation 

We construct a relative height map I(x, y), where (x,y) € 
[0,r) x [0,r). In this map, each pixel represents either a hole 
or records the relative signed height from surface S to point 
cloud C. This construction enables us to perform quantitative 
processing on the high-frequency components H. Pixels that 
do not correspond to any t; are considered as holes, denoted as 
Q, while for the non-hole region ®, we determine their pixel 
values based on Dp,, where Py € ®. 

To effectively highlight the local features of point clouds, we 
propose a strategy that selects the extremum value from D,, for 
each pixel. This strategy can be expressed as follows: 


Hole Np+ + Np- =0 
I(x, y) = max{Dp.x,)} Np+ >= Np- (10) 
min{Dp,x,y)} Np+ < Np- 


where Np+ and Np- represent the number of positive and 
non-positive values of Dp, in the pixel I(x, y) respectively. If 
I(x, y) € ®, we compare Np: and Np- within /(x, y) and select 
the value of Dp, with the highest magnitude if Np» is greater. 
Conversely, we choose the value with the smallest magnitude 
of D Pry: 


5.3. Height Map Inpainting 


After representing the high-frequency components using a 
height map, our objective shifts to inpainting the height map. 
This is achieved by solving the Poisson equation, guided by 
patch matching in the gradient domain. 


(a) Sampling 


(b) Generating new points 


Fig. 6. Illustration of the reconstruction process. (a) We generate a set of 
sample points denoted as a within the region Q, where blue pixels represent 
positive intensity and orange pixels represent negative intensity. (b) These 
sample points are mapped to S, and the intensity values / are offset along 
their respective normals. It is important to note that if the intensity is 
negative, the offset direction is reversed to ensure accurate displacement. 
New points are denoted as £. 


5.3.1. Constructing Poisson Equation 

After constructing L and H respectively, we transfer the task 
of point cloud inpainting to that of H inpainting represented by 
a height map. We begin with an intensity image /(x, y) and the 
desired gradients G,(x,y) and G,(x, y) inside the hole Q. We 
aim to inpaint a new intensity /*(x,y) within Q, subject to the 
constraint that J and /* agree on the hole boundary ðQ. In other 
words, the best 7* (x,y) of Q is the solution of the minimization 
problem: 


min I (x,y) — G(x, i 
eoa {fi (x,y) — G(x, y) (11) 
with F(x, y)laa= I(x, yla 


The minimizer must satisfy the associated Euler-Lagrange 
equation, and it is the unique solution of the following Poisson 
equation with Dirichlet boundary conditions: 


AF (x, y) = div(G,(x, y), Gy(x, y)) 


. k (12) 
with I(x, y)laa= I(x, y)laa 


where A is the Laplacian operator and div represents the diver- 
gence of a 2D guidance field. 


5.3.2. Patch-match Based Inpainting in Gradient Domain 

The objective is to obtain inpainted values G,(x,y) and 
G,(x,y) within the hole region Q, which will solve Equa- 
tion (LI. Typically, a well-structured patch is present in a 
known region of Jp, and we employ the patch-match method 
to find the optimal solution for G,(x, y) and G,(x, y). 

The core of the patch-match system revolves around the com- 
putation of patch correspondences. A Nearest-Neighbor Field 
(NNF) is a function f : J > R? that maps offsets to all possible 
patch coordinates in image /, utilizing some distance function 
between two patches. In this paper, we adopt the framework 
introduced by Barnes et al. to construct all NNFs of Jo. 
For each pixel Jo(x, y), we gather all NNFs that contain it, de- 
noted as Fy. By utilizing these NNFs, we can calculate the 
final G,(x, y) and G,(x, y). 


cœ»= ff Gr(x,y) (13) 


with Gr(x,y) € Fry 


5.3.3. Solving Intensity Image 

After obtaining the values of all pixels belonging to © in the 
gradient domain, the next step is to solve for the true values of 
Ip. We can use the known values of the pixels outside the hole 
pixels as boundary conditions for the reconstruction problem. 
The formula for this reconstruction is as follows: 


AP (x,y) = 


ôG, ôG, 
Ox (x,y) + By 


with I*(x, y)laa= I(x, y)laa 


(14) 


Then we can get discrete applications in the image domain: 
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(x+1,y)+ I(x- 1,y)+ (x,y + 1) + P(x,y- 1) 


_ Gy) _ Gy) 
~ Ox oy 


Al" (x,y) (15) 


For each pixel in the domain Q, we can obtain a matrix equa- 
tion of the form AX = B, where A represents the Laplacian op- 
erator of the image, B represents the divergence operator, and X 
represents the unknown /*(x, y). We can solve for all J*(x, y) by 
setting up these equations and solving a sparse linear system of 
them. 


6. Reconstruction From Decomposed Signals 


As shown in Figure [6] once all values of /*(x, y) within the 
hole region Q have been solved, we can proceed to recon- 
struct the point cloud signals using the complete signals H and 
L. Specifically, we sample points in the two-dimensional UV 
space, which is represented as the image in Section|5.2.2| These 
sample points, denoted as œ, are then mapped to the three- 
dimensional point cloud space using Equation (1). By calcu- 
lating the normals and intensities of œ, we generate new points 
by applying offsets to them within the three-dimensional space. 

To achieve a random distribution of a in UV space, satisfying 
a € [0, 1]’, we utilize the Halton sequence to generate sam- 
pling points within /*(x, y). Each sampling point can determine 
its corresponding coordinates on S(uq, Va) in three-dimensional 
space. Furthermore, the partial derivatives with respect to u and 
v represent the tangent vectors at S(u, v): 


ðS ðB; OBy OB, 

ðu du’ du’ ðu 

ðS ðB; OBy OB; 

ðv Av’ Av’ ðv 
In Equation (16), the first one represents the tangent vector in 
the u-direction, while the second one represents the tangent vec- 
tor in the y-direction. The normal vector 7,,, at S(u, v), is ob- 
tained by taking the cross-product of these partial derivatives, 
following the right-handed rule: 


(16) 


Baam (17) 
Sn X Fl 

Additionally, the intensity values can be obtained by perform- 
ing bilinear interpolation on neighboring pixels of the sample 
points. This enables a smoother reconstruction of the point 
cloud, resulting in improved coherence between the recon- 
structed point cloud and the original point cloud. Furthermore, 
we can obtain the normal vector 7, and intensity value J, 
through bilinear interpolation. By utilizing this method, we can 
successfully reconstruct the final point clouds: 


C= Stu, v) + Rav x Days (18) 


7. Handling Large-Scale Point Clouds 


In order to effectively handle Large-scale data of terrain point 
cloud, we employ a data preprocessing step prior to implement- 
ing our algorithmic framework. Due to the large number of 
points present in raw acquired data, directly fitting it to a B- 
spline surface would be time-consuming. To enhance the effi- 
ciency of the fitting process, we propose a data preprocessing 
step in which we downsample C using voxelization techniques. 
The downsampling scheme involves spatial discretization and 
gravity center extraction, as demonstrated below: 


1. Spatial discretization: In this step, C is split into voxels. 
Each voxel comprises a set of points {Po, p1,..., P,} (Bi € 
R3), where f; represents the coordinates of the i-th point 
in the voxel and N, denotes the number of points in the 
voxel. 

2. Gravity center extraction: For each non-empty voxel, we 
replace points inside this voxel with the gravity center p’ 
as shown in Equation (19): 


rasa (19) 


The downsampled terrain point cloud serves as the input for 
Section |4] 


8. Experimental Results 


In this section, we demonstrate the effectiveness of our algo- 
rithm through three sets of experiments. Firstly, section|8.3]il- 
lustrates that the low-frequency component generated can effec- 
tively represent the basic shape of the input model. Section]8.4] 
highlights the necessity and effectiveness of the high-frequency 
component in our proposed point cloud representation frame- 
work. Finally, in section we showcase our overall inpaint- 
ing results and compare them with other algorithms. 


8.1. Experimental Setup 


We evaluate the proposed method by testing it on seven real- 
world terrain scenes, coming from NatureManufacture and 
JianShan [35]. The holes in these scenes are introduced by: (1) 
the removal of unwanted objects such as trees and vehicles, and 
(2) deliberately creating holes on the terrain so that we can com- 
pare the hole filling results with ground truth. In Section [8.5] 
each of the terrain datasets with ground truth contains approxi- 
mately 100,000 point clouds, while the terrain datasets without 
ground truth consists of around 1 million point clouds. Addi- 
tionally, it is important to note that our focus solely lies on the 
positional information of point clouds, disregarding their col- 
ors. Consequently, the chosen terrain data uniformly considers 
elevation as the color component. 

In our experiments, we construct L by empirically using a 
4th-order (set p = 3 and g = 3) B-spline surface to ensure that 
the fitting result neither incurs excessive cost nor loses signifi- 
cant details. Increasing the number of control points can make 
L more refined, but this also implies higher costs. As a trade- 
off between accuracy and efficiency, we set both m and nas 19. 
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Thus, the uniformly spaced knot vectors u = (uo, u1, .. . , u22), 
v = (vo, V1, . . . , V22) can be obtained where u; = + and vj = 5 
with i, j € [0,22]. The iteration size to perform fitting opti- 
mization is J = 10. Additionally, during preprocessing, we set 
the ratio of voxel edge length to the longest axis length of the 
oriented bounding box (OBB) to 0.05. For image inpainting, 
we utilize a patch size of 11 and conduct 10 iterations of patch- 
match. 

Specifically, we compare our methods with several estab- 
lished traditional and learning-based methods, including one 
mesh-based traditional method Attene et al. (MeshFix), 
three point-based traditional methods, i.e. Doria et al. 
(DGI), Altantsetseg et al. (CSP), Shi et al. (NBFM) 
and three learning-based point cloud completion methods, i.e. 
Alliegro et al. (Deco), Xianget al. (Snowflake), lin et 
al. (HyperCD). 


8.2. Evaluation Metric of Inpainting Results 


We apply two evaluation metrics named GPSNR and NSHD 
to measure geometric differences between two point clouds ob- 
jectively. 

GPSNR measures the distortion between point cloud A and 
B with respect to a peak value: 


2 
GPS NR4.B = 1010819 ua (20) 
A,B 


where pg refers to the diagonal distance of a bounding box 
of the point cloud and e4,g represents point-to-plane distances. 
Therefore a higher GPSNR means a lower gap between two 
point clouds. 

NSHD (Normalized Symmetric Hausdorff Distance) is the 
normalized version of the One-sided Hausdorff Distance: 


1 
NS HD, pB = ymanOHDas, OHDz a} (21) 


where V denotes the volume of the oriented bounding box of 
a point cloud, and OH D4,g represents the One-sided Hausdorff 
Distance from point cloud A to point cloud B. Therefore, the 
smaller the NSHD, the more similar the two point clouds are. 
However, NSHD is closely related to point density, so compar- 
isons between different test scenarios may not be meaningful. 


ey “| 


(a) Input point cloud with hole (b) Low-frequency component repre- 


sented as B-spline surface 


Fig. 7. Illustration of the low-frequency component result. (a) presents 
the input point cloud model, which contains holes in its structure. (b) 
showcases the fitted B-spline surface obtained through our algorithm and 
NRMSE value between itself and (a). 


8.3. Results on Low-frequency Component Inpainting 

As depicted in Figure we present the obtained low- 
frequency component. From a subjective visual perspective, 
we leverage the flexibility and smoothness properties of the B- 
spline surface to accurately fit a shape that closely resembles the 
original point cloud model. Through this fitting process, we ef- 
fectively capture the overall shape and curvature characteristics 
of the original point cloud, ensuring a faithful representation of 
its global shape and curvature features. To objectively evaluate 
the accuracy of the fitting results and quantify the discrepancy 
between the fitted surface and the original signal, we employed 
NRMSE (Normalized Root Mean Square Error) as the evalu- 
ation metric. NRMSE is a value that ranges from 0 to 1, with 
values closer to 0 indicating a higher similarity between the pre- 
dictive model and the truth, while values closer to 1 suggest the 
opposite. In our specific case, the obtained NRMSE value of 
0.011 demonstrates that our low-frequency component closely 
approximates the input point cloud to a significant extent. 

DGI employs a representation similar to DEM, utilizing XY 
plane as the low-frequency signal. However, our approach to se- 
lecting the low-frequency component is more reasonable. Fig- 
ure |8] illustrates this by showcasing the generation of corre- 
sponding unrepaired high-frequency components. Subjectively, 
our height map exhibits more natural color variations, while the 
height map obtained by DGI displays significant differences be- 
tween neighboring pixels, indicating abrupt changes in eleva- 
tion and potential loss of elevation information. Objectively, 
an analysis of the number of projection points per pixel in the 
height map reveals a more balanced distribution of the original 
point cloud across our generated low-frequency signal. In our 
algorithm, the proportion of pixels with less than 10 projection 
points is approximately 95%, with a maximum of 54 projection 
points. In contrast, for DGI, these figures are 89% and 128, 
respectively. 


8.4. Results on High-frequency Component Inpainting 

Figure P] illustrates the comparison between the results ob- 
tained by inpainting only low-frequency component and those 
of a complete reconstruction, highlighting the effectiveness of 
high-frequency component. Subjectively, the inpainting re- 
sult using only low-frequency component appears excessively 
smooth due to direct sampling from the fitted B-spline surface, 
which lacks geometric details and fails to capture the natural 
characteristics of the terrain. Additionally, the low-frequency 
surface is theoretically limited in its ability to pass through 
all input points due to the constraints imposed by the num- 
ber of B-spline control points. Consequently, relying solely 
on low-frequency component leads to a lack of smooth tran- 
sition with the surrounding point cloud. In contrast, incorporat- 
ing both low-frequency and high-frequency components yields 
more natural results. By considering neighborhood informa- 
tion during the reconstruction of high-frequency component, 
the generated 3D point clouds exhibit a certain level of smooth- 
ness with the original points, without noticeable gaps along the 
boundary. 

Regarding the inpainting of high-frequency component, our 
strategy of solving the Poisson equation guided by the gradi- 
ent domain which is inpainted by patch matching, can generate 
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Fig. 8. (a) The height map based on the B-spline surface in ours. (b) The 
height map based on XY plane in DGI. (c) Distribution of the number of 
projection points per pixel in Ours. (c) Distribution of the number of pro- 
jection points per pixel in DGI. 


results that are closer to the ground truth. Figure [70] presents 
a comparison between our height map inpainting method and 
the random noise inpainting approach. From a subjective vi- 
sual perspective, our approach exhibits a resemblance between 
the inpainted height map and that of ground truth, as well as 
the final reconstructed point cloud. In contrast, employing the 
strategy of randomly filling the holes in the height map leads 
to a disordered point cloud representation. Additionally, we 
evaluate the differences between the two height map inpainting 
strategies and the ground truth using RMSE, and assess the fi- 
nal generated point cloud using GPSNR and NSHD. Our evalu- 
ation results consistently outperform those of the random filling 
approach, providing objective evidence of the superior perfor- 
mance of our height map inpainting strategy. 


8.5. Overall results and comparisons 


To facilitate numerical analysis, we test on some terrain mod- 
els with complex holes created artificially for the purpose of 
comparing our algorithm with others. For convenience, we test 
on several terrain point clouds, abbreviated as “Terrain”. A 
comparison with traditional algorithms is shown in Figure [IT] 
and Figure [12]demonstrates the error map to highlight the dis- 
tribution of errors. From a subjective visual perspective, our 
algorithm performs well in generating point clouds with uni- 
form density and natural characteristics around both complex 
boundary holes and no-well-defined holes. However, other al- 
gorithms encountered their problems. CSF fails to handle point 
cloud holes like TerrainO, which lack well-defined boundaries, 
due to the requirement for accurate extraction of hole bound- 
aries and sorting of boundary points. This limitation leads to 
highly erroneous results. The density of points inpainted by 


(a) Hole (b) Inpaint only by low- (c) Inpaint after recon- 


frequency component struction 


(d) B-spline surface (e) Height map with hole (f) Height map inpainted 


Fig. 9. Illustration of necessity of high-frequency component, in which in- 
painted points are marked in blue. (a) Original Point Cloud with Holes. 
(b) Inpaint result only using low-frequency component, which generated 
by sampling points on the surface. (c) Final result after applying both low- 
and high-frequency components. (d) Fitted B-spline surface. (e) Height 
map to be repaired. (f) Height map after filling holes. 


CSF may be uneven since it necessitates iterative convergence 
towards the interior of the hole. For instance, in Terrain] and 
Terrain2, the density at the center of the hole is lower than that 
at the hole’s edges. On the other hand, MeshFix detects and 
fills holes by using mesh representation, resulting in generated 
models that lack geometric details. Similar to the outcomes 
in Terrain! and Terrain2, it produces point cloud structures re- 
sembling cross-sections at the holes, while maintaining a very 
smooth appearance for inpainted points. Although DGI per- 
forms well in handling most cases, it tends to disregard infor- 
mation along the vertical axis, resulting in an unresolved hole 
region in Terrain3. Moreover, the adoption of a greedy march- 
ing strategy during the height map inpainting process leads to 
a noticeable lack of smoothness in the central region of holes, 
which is evident in Terrainl!. NBFM employs a cube-based di- 
vision of the point cloud space to perform inpainting. It selects 
appropriate cubes from the complete region of the point cloud 
to align with and replace the hole area. While this method ef- 
fectively fills small holes, it encounters convergence challenges 
when dealing with large hole areas, as shown in TerrainO and 
Terrainl. The difficulty lies in accurately estimating the cube 
size, as excessively small cubes accentuate local structural char- 
acteristics, while overly large cubes hinder point cloud align- 
ment. Moreover, NBFM relies on boundary point detection to 
determine the inpainting area, resulting in the introduction of 
additional boundary points if the accuracy of point cloud regis- 
tration is not satisfactory. 

From an objective perspective, we compare the results in- 
painted by ours and others with the ground truth and calcu- 
late two metrics: GPSNR and NSHD, respectively. As demon- 
strated in Table[I|and Table[2| our numerical values outperform 
those of other algorithms. Specifically, in terms of GPSNR, our 
method achieves an average score of 40.367DB, which is much 
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(a) Ground truth 


(b) Inpaint by our repair- 
ing scheme 


(c) Inpaint by noise 


in (e) Height map in ours (f) Height map with 
noise 


(d) Height map 
ground truth 
Fig. 10. Illustration of effectiveness of image inpainting in high-frequency 
component repairing strategy, in which a comparison is conducted between 
ours and random noise. (a) Ground truth. (b) Final result inpainted by our 
height map repairing scheme. (c) Final result inpainted by random noise 


in height map. (d) Ground truth’s height map. (e) Inpainted height map in 
ours. (f) Filled height map with noise. 


Table 1. PERFORMANCE COMPARISON IN GPSNR (DB) 
CSF MeshFix DGI NBFM Ours 


Terrain0 8.3976 17.036 31.550 30.207 40.036 
Terrain! 25.663 21.376 34.739 23.315 50.745 
Terrain2 12.128 12.054 20.629 17.341 40.044 
Terrain3 28.014 27.960 26.241 28.091 30.640 


higher than the others. Similarly, our solutions are also better 
in terms of NSHD, as we achieve lower values compared to the 
other methods. 

Additionally, we explore deep learning-based methods for 
terrain point cloud inpainting. Due to limitations in the size of 
neural network inputs, we need to downsample our point cloud 
data. Subsequently, we split terrain point clouds into cubes and 
remove certain neighboring cubes to automatically generate a 
partial dataset for training purposes. The results depicted in Fig- 
ure[I4]indicate that these methods cannot be directly applied to 
inpaint terrain scenes. This limitation arises from two key fac- 
tors: (1) these methods are typically designed for smaller-scale 
point cloud data and demonstrate inadequate performance or in- 
applicability when applied to larger-scale point cloud data, and 


Table 2. PERFORMANCE COMPARISON IN NSHD 
CSF MeshFix DGI NBFM Ours 


Terraind 149.40 38.264 20.544 33.250 9.0525 
Terrain! 59.728 74.045 38.221 56.110 21.319 
Terrain2 54.491 66.988 25.993 48.682 19.510 
Terrain3 0.3190 0.1310 0.2261 0.2584 0.1074 


(2) these methods are commonly tailored to address substantial 
missing data, typically accounting for 50% or more of the point 
cloud, whereas the missing parts in our scenes are relatively 
small. The challenge lies in accurately localizing these holes, 
which ultimately hampers the effectiveness of these methods in 
the hole-filling process. 

Moreover, we conduct a test in a scenario without ground 
truth, where holes are generated by employing a point cloud 
segmentation algorithm to extract and remove non-ground 
points. CSF and NBFM require the extraction of boundary 
points for the holes, which proves challenging in this specific 
scenario, resulting in their failure. Specifically, the former is 
unable to construct a complete boundary point loop that is es- 
sential for driving the inpainting process, while the latter, when 
substituting point cloud patches within the hole region, intro- 
duces additional boundary points as a consequence of imperfect 
registration, thus hindering the convergence of the inpainting 
process. As depicted in Figure [13] DGI exhibits significant in- 
formation loss in the height dimension, leading to the presence 
of numerous hole regions in the vertical direction. Addition- 
ally, the large gaps between these hole regions result in incom- 
plete mesh reconstruction when using MeshFix. In summary, 
our method surpasses others in terms of repair effectiveness and 
completion rate. 


9. Discussion 


CSF 
MeshFix 


Time Cost (s) 


Terrain 1 Terrain 2 Terrain 3 


Data 


Terrain 0 


Fig. 15. Comparison with other algorithms in terms of time cost. 


Certainly, our algorithm has its limitations. As illustrated 
in Figure it demonstrates disadvantages in terms of time 
complexity compared to alternative algorithms. This can be at- 
tributed to the significant time required for both B-spline fitting 
and solving the Poisson equation. The time complexity of fit- 
ting a uniform-parameterized cubic B-spline surface depends 
on factors such as the size of the point cloud data, the number 
of control points and iterations. Downsampling the point cloud 
provides an effective approach to accelerate the fitting process. 
The time complexity of solving the Poisson equation is influ- 
enced by the resolution of the height map, which is typically a 
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Fig. 11. We conducted tests on some models with ground truth available. We presented the input, ground truth, as well as the results of our algorithm and 


four other algorithms. 


CSF MeshFix DGI 
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Fig. 12. Illustration of Chamfer Distance: The points are color-coded based on their distance to the nearest point in the ground truth point cloud. Bluer 
colors indicate shorter distances, while redder colors indicate longer differences. 


high value to capture sufficient high-frequency information in 
large-scale scenarios. Additionally, solving large sparse linear 
systems using LU factorization has a time complexity of O(n). 


10. Conclusion 


To address the presence of complex and indistinct bound- 
ary holes in terrain point clouds acquired through 3D acqui- 
sition technology, we propose a novel representation method 
for terrain point clouds to facilitate their inpainting. The funda- 
mental idea is to decompose the point cloud signal into a low- 
frequency component represented by a B-spline surface and a 
high-frequency component represented by a relative height map 
generated from projecting discrete points to the surface. This 
transformation enables us to convert the point cloud inpainting 
problem into a surface fitting and 2D image inpainting problem. 
Subsequently, we combine the repaired high-frequency compo- 
nent with the complete low-frequency component to reconstruct 
new point clouds in 3D space. Additionally, we automate hole 
location based on the relative height map, eliminating the need 
for hole boundary extraction. 

We conducted tests on several real-world scenes to evaluate 
our method, comparing it with some classical existing methods. 
The results demonstrate that our approach is capable of recon- 
structing more detailed geometric information and effectively 
handling complex hole boundaries. 


Certainly, We have identified several areas for future work 
that warrant further investigation. Reducing time cost is part of 
our future research agenda, as we aim to address and overcome 
it. Considering the importance of these benchmarks, we also 
plan to further explore the application of our method on public 
release datasets in order to enhance its generalization. 
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